-
-
Notifications
You must be signed in to change notification settings - Fork 8.8k
[V1] Logits processors extensibility #19912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @afeldman-nm it's looking a lot better now. A few remaining comments but getting close I think.
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @afeldman-nm
@@ -2067,7 +2069,7 @@ def _dummy_sampler_run( | |||
output_token_ids=[[] for _ in range(num_reqs)], | |||
allowed_token_ids_mask=None, | |||
bad_words_token_ids={}, | |||
logitsprocs=LogitsProcessorManager(), | |||
logitsprocs=LogitsProcessors(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether this should contain real LPs since they contain equivalent sampling tensors. Not necessarily needs to be addressed in this PR though.
argmax_invariant: list["LogitsProcessor"] = field( | ||
default_factory=list, init=False) # argmax-invariant logitsprocs | ||
non_argmax_invariant: list["LogitsProcessor"] = field( | ||
default_factory=list, init=False) # non-argmax-invariant logitsprocs | ||
|
||
def __init__( | ||
self, | ||
logitsprocs: Optional[Iterator["LogitsProcessor"]] = None) -> None: | ||
self.argmax_invariant = [] | ||
self.non_argmax_invariant = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these fields need to be declared here
argmax_invariant: list["LogitsProcessor"] = field( | |
default_factory=list, init=False) # argmax-invariant logitsprocs | |
non_argmax_invariant: list["LogitsProcessor"] = field( | |
default_factory=list, init=False) # non-argmax-invariant logitsprocs | |
def __init__( | |
self, | |
logitsprocs: Optional[Iterator["LogitsProcessor"]] = None) -> None: | |
self.argmax_invariant = [] | |
self.non_argmax_invariant = [] | |
def __init__( | |
self, | |
logitsprocs: Optional[Iterator["LogitsProcessor"]] = None) -> None: | |
self.argmax_invariant: list["LogitsProcessor"] = [] | |
self.non_argmax_invariant: list["LogitsProcessor"] = [] |
# Build logits processors. If specified by user, load custom | ||
# logitsprocs constructors. | ||
self.logitsprocs = logitsprocs or LogitsProcessors(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the comment is needed or accurate here anymore?
# Build logits processors. If specified by user, load custom | |
# logitsprocs constructors. | |
self.logitsprocs = logitsprocs or LogitsProcessors(None) | |
self.logitsprocs = logitsprocs or LogitsProcessors() |
SWAP = 1 | ||
|
||
|
||
# (index, params, output_tok_ids, prompt_tok_ids) tuples for new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be a bit more of a logical ordering to have prompt token ids first?
# Every other scenario is a different way of loading a custom logitproc | ||
_run_test(kwargs, logitproc_loaded=False) | ||
return | ||
elif logitproc_source == CustomLogitprocSource.LOGITPROC_SOURCE_ENTRYPOINT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elif logitproc_source == CustomLogitprocSource.LOGITPROC_SOURCE_ENTRYPOINT: | |
if logitproc_source == CustomLogitprocSource.LOGITPROC_SOURCE_ENTRYPOINT: |
|
||
def __init__(self, vllm_config: "VllmConfig", device: torch.device, | ||
is_pin_memory: bool): | ||
self.req_info = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.req_info = {} | |
self.req_info: dict[int, SamplingParams] = {} |
|
||
# (index, params, output_tok_ids, prompt_tok_ids) tuples for new | ||
# requests added to the batch. | ||
AddedRequest = tuple[int, Union[SamplingParams, PoolingParams], list[int], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to support LP for pooling requests? Maybe we should just keep this as SamplingParams
for now?
Purpose
Enable V1 logits processors support to be extended with custom logits processors.
New Python interface for custom logitsprocs
New CLI interface for custom logitsprocs
Test Plan
(WIP)
Test Result
WIP
(Optional) Documentation Update
WIP
Fixes #17799
Fixes #12678